-
Notifications
You must be signed in to change notification settings - Fork 594
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(catalog): add support for redacting sensitive connector properties #14193
Conversation
7d01bee
to
609503c
Compare
609503c
to
8c46491
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Haven't took a look yet. QQ: Why did you use a trait, instead of the safer (IMO) approach like #13034?
@@ -33,8 +34,10 @@ pub struct S3Properties { | |||
#[serde(rename = "match_pattern", default)] | |||
pub match_pattern: Option<String>, | |||
#[serde(rename = "s3.credentials.access", default)] | |||
#[mark_redaction(rename = "s3.credentials.access")] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
access key seems not that sensitive. cc @hzxa21
Because to redact connector properties from a This PR implements See |
If we have... impl serde::Serializer for Serializer {
type Ok = Vec<SqlOption>;
..
} then wrap secrets with The most tricky part I can think of is to escape back the unescaped |
@@ -33,8 +34,10 @@ pub struct S3Properties { | |||
#[serde(rename = "match_pattern", default)] | |||
pub match_pattern: Option<String>, | |||
#[serde(rename = "s3.credentials.access", default)] | |||
#[mark_redaction(rename = "s3.credentials.access")] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we reuse the serde
attributes for mark_redaction
to avoid duplication?
let mut stmt = stmt.clone(); | ||
let sql_options = match &mut stmt { | ||
Statement::CreateSource { stmt } => &mut stmt.with_properties.0, | ||
Statement::CreateTable { with_options, .. } => with_options, | ||
Statement::CreateSink { stmt } => &mut stmt.with_properties.0, | ||
_ => { | ||
return Some(stmt); | ||
} | ||
}; | ||
let Ok(mut with_properties) = | ||
WithOptions::try_from(sql_options.as_slice()).map(WithOptions::into_inner) | ||
else { | ||
return None; | ||
}; | ||
let Some(connector) = with_properties.remove(UPSTREAM_SOURCE_KEY) else { | ||
tracing::error!("Must specify 'connector' in WITH clause"); | ||
return None; | ||
}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about finding a way to unify this snippet with ...
(Sorry for the late review.) Oh I get it. The type-safe approach is good for redacting the Property structs themselves (e.g., |
fn redact_marked_field(options: &mut Vec<SqlOption>, marks: std::collections::HashSet<String>) { | ||
for option in options { | ||
if marks.contains(&option.name.real_value()) { | ||
option.value = Value::SingleQuotedString("[REDACTED]".into()); | ||
} | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm thinking do we really need to get the keys to redact precisely? What about simply pass ["secret", "credential", "key"]
etc. ? 🤣
Bugen's idea on relying on serde
is also acceptable.
Can you give an example of how this can go wrong? 👀 |
I see the alternative approach in #14310, this is quite a simple solution. 🤣 impl Masking for Vec<SqlOption> {
fn mask<'a>(&mut self, _context: &mut MaskingContext) {
for sql_option in self {
sql_option.value = Value::SingleQuotedString("[REDACTED]".to_string());
}
}
} |
Will this be too aggressive? For example, |
It's us that convert it to type-unsafe maps. 😕 risingwave/src/frontend/src/utils/with_options.rs Lines 177 to 206 in 975bfac
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the overall idea LGTM, can we also cover some struct from sdk, such as
This approach is more universal, which can solve the redaction of structs in connector crate as well. WIP |
I hereby agree to the terms of the RisingWave Labs, Inc. Contributor License Agreement.
What's changed and what's your intention?
Review tips:
src/utils/redaction/redaction/src/lib.rs
defines a traitMarkRedaction
. It simply returns property names that need redaction.src/utils/redaction/redaction_derive/src/lib.rs
defines a derive macro to implementMarkRedaction
.src/frontend/src/utils/redaction.rs
provides function to redact SQL.Basic idea:
To make it easier to redact sensitive connector properties from table definition when displaying,
MarkRedaction
trait for all connector properties types, using the new derive macro. Developers are responsible for marking properties that should be redacted. In this PR I marked AK/SK of S3Properties for example.try_redact_sql
invokesMarkRedaction::marks
to get all property names that should be redacted. Seesrc/frontend/src/utils/redaction.rs
.src/utils/redaction/redaction_derive/src/lib.rs
.An example from
test_redact_s3_source
.Original SQL:
After redaction:
#14115
Checklist
./risedev check
(or alias,./risedev c
)Documentation
Release note
If this PR includes changes that directly affect users or other significant modifications relevant to the community, kindly draft a release note to provide a concise summary of these changes. Please prioritize highlighting the impact these changes will have on users.